Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model

Authors

Abstract

Studies on grammatical error correction (GEC) have reported on the effectiveness of pretraining a Seq2Seq model with a large amount of pseudodata. However, this approach requires time-consuming pretraining for GEC because of the size of the pseudodata. In this study, we explore the utility of bidirectional and auto-regressive transformers (BART) as a generic pretrained encoder-decoder model for GEC. With the use of a generic pretrained model, this time-consuming pretraining for GEC can be eliminated. We find that monolingual and multilingual BART models achieve high performance in GEC, with one of the results being comparable to current strong results in English GEC. Our implementations are publicly available at GitHub (https://github.com/Katsumata420/generic-pretrained-GEC).
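As a rough illustration of the approach described in the abstract, the sketch below treats GEC as sequence-to-sequence generation with a generic pretrained BART model via the Hugging Face transformers library. The checkpoint name, example sentence, and decoding settings are assumptions made for illustration only and are not the authors' released configuration; see the linked repository for that.

```python
# Minimal sketch: using a generic pretrained BART model as a Seq2Seq GEC system.
# Assumptions: the Hugging Face `transformers` library, the `facebook/bart-base`
# checkpoint, and the beam-search settings are illustrative; in the paper the
# model is fine-tuned on GEC pairs (erroneous sentence -> corrected sentence)
# before generation.
from transformers import BartForConditionalGeneration, BartTokenizer

model_name = "facebook/bart-base"  # assumed checkpoint; swap in a GEC fine-tuned model
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

source = "She go to school every days ."        # erroneous input sentence
inputs = tokenizer(source, return_tensors="pt")

# Beam-search decoding; with a GEC fine-tuned checkpoint this yields the correction.
output_ids = model.generate(**inputs, num_beams=5, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```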


Similar articles

A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction

We improve automatic correction of grammatical, orthographic, and collocation errors in text using a multilayer convolutional encoder-decoder neural network. The network is initialized with embeddings that make use of character N-gram information to better suit this task. When evaluated on common benchmark test data sets (CoNLL-2014 and JFLEG), our model substantially outperforms all prior neura...


A Beam-Search Decoder for Grammatical Error Correction

We present a novel beam-search decoder for grammatical error correction. The decoder iteratively generates new hypothesis corrections from current hypotheses and scores them based on features of grammatical correctness and fluency. These features include scores from discriminative classifiers for specific error categories, such as articles and prepositions. Unlike all previous approaches, our m...
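For intuition, the following is a generic sketch of the iterative beam-search loop this abstract describes: each round expands the current hypotheses with new candidate corrections and keeps the highest-scoring ones. The propose_edits and score callables are hypothetical placeholders standing in for the paper's hypothesis generation and feature-based scoring (classifier scores for specific error types plus fluency features).

```python
# Generic sketch of iterative beam search over candidate corrections.
# `propose_edits` and `score` are hypothetical placeholders, not the paper's API.
from typing import Callable, Iterable, List, Tuple

def beam_search(sentence: str,
                propose_edits: Callable[[str], Iterable[str]],
                score: Callable[[str], float],
                beam_size: int = 8,
                iterations: int = 3) -> str:
    beam: List[Tuple[float, str]] = [(score(sentence), sentence)]
    for _ in range(iterations):
        candidates = {hyp for _, hyp in beam}
        for _, hyp in beam:
            candidates.update(propose_edits(hyp))   # expand each hypothesis with new corrections
        scored = sorted(((score(c), c) for c in candidates), reverse=True)
        beam = scored[:beam_size]                   # keep the best-scoring hypotheses
    return beam[0][1]                               # highest-scoring correction found
```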


A Hybrid Model For Grammatical Error Correction

This paper presents a hybrid model for the CoNLL-2013 shared task which focuses on the problem of grammatical error correction. This year’s task includes determiner, preposition, noun number, verb form, and subject-verb agreement errors, which is more comprehensive than previous error correction tasks. We correct these five types of errors in different modules where either machine learning based...


Generating a Training Corpus for OCR Post-Correction Using Encoder-Decoder Model

In this paper we present a novel approach to the automatic correction of OCR-induced orthographic errors in a given text. While current systems depend heavily on large training corpora or external information, such as domain-specific lexicons or confidence scores from the OCR process, our system only requires a small amount of relatively clean training data from a representative corpus to learn...


Simplification of the encoder-decoder circuit for a perfect five-qubit error correction

Simpler networks of encoding and decoding are necessary for more reliable quantum error correcting codes (QECCs). The simplification of the encoder-decoder circuit for a perfect five-qubit QECC can be derived analytically if the QECC is converted from its equivalent one-way entanglement purification protocol (1-EPP). In this work, the analytical method to simplify the encoder-decoder circuit is...



Journal

Journal title: Shizen gengo shori

Year: 2021

ISSN: 1340-7619, 2185-8314

DOI: https://doi.org/10.5715/jnlp.28.276